The main focus of this side project is to find the effects of:
acceptability of different opinions,
narrowness of the identity group,
opinion dimensions,
number of polarized opinions, and
polarization distance at the start.
The result should be a simple graph showing how ESBG polarization changes with these variables.
Note: The experiment is still running; we are at 30 complete sets of all value combinations out of 240. In other words, we are not done yet.
## We need to run this while the experiment is still running; later, after data finalization, we might comment this out:
## Helper: load one results file and keep/rename the columns we need
read_part = function(part) {
  read_csv(paste0("polarizedPart", part, ".csv.csv"), skip = 6) %>%
    select(2:7, neis = 14, step = 50,
           start = ESBSG_polarization_start, final = ESBSG_polarization_final)
}
## Bind parts 72..77 row-wise (map_dfr comes with purrr/tidyverse)
polarStart = map_dfr(72:77, read_part)
save(polarStart, file = "polarStart.RData")
The data are at https://github.com/frantisek901/Spirals/tree/master/PolarizedStart.
The experiment is still running; I, FranČesko, update the *.csv files
at GitHub from time to time and then run the script
experiment.R, which loads the data. Now, 2022-03-27, we are
at roughly 12.5 %. Anyone not interested in working with megabytes of
*.csv files might use the compiled
phase2w.RData.
Now we load these data, aggregate them, and factorize and rename selected variables:
## Loading stored data
load("polarStart.RData")
## First, we have to find the highest complete RS, i.e. the set of all parameter combinations simulated:
RS_complete = (polarStart %>%
filter(neis == 52) %>% # For now, we have to filter out simulations with narrow neighborhood
group_by(RS) %>% summarise(n = n()) %>% filter(n == max(n)))$RS
## Preparing individual data 'dfi'
dfi = polarStart %>%
## Renaming vars:
prejmenuj(3:6, c("acceptability", "narrowness", "pol_ops", "pol_dist")) %>%
## Filtering observations:
filter(neis == 52) %>% # For now, we have only a few simulations with narrow neighborhood, so we filter them out.
filter(RS %in% RS_complete) %>%
## Denormalizing ESBG:
mutate(start = start * sqrt(opinions), final = final * sqrt(opinions), change = final - start,
change_cat = case_when(
change <= -0.2 ~ "Decrease",
change > -0.2 & change <= -0.1 ~ "Slight decrease",
change < +0.2 & change >= +0.1 ~ "Slight increase",
change >= +0.2 ~ "Increase",
TRUE ~ "No big change"
) %>% factor(levels = c("Decrease", "Slight decrease", "No big change", "Slight increase", "Increase")))
## We don't need now the loaded original full data:
# rm(polarStart)
## Summarising 'dfi' into 'dfs':
dfs = dfi %>%
group_by(opinions, acceptability, narrowness, pol_ops, pol_dist) %>%
summarise(start = mean(start), final = mean(final), change = mean(change)) %>% ungroup() %>%
mutate(across(start:change, ~round(.x, 2)))
## Renaming variables according to the 2022-03-18 meeting:
# prejmenuj(1:4, c("Opinion dimensions:", "Acceptability of different opinion:", "Identity:",
# "Narrowness of identity group:"))
I de-normalized ESBG, i.e. I multiplied it by sqrt(opinions). I noticed that ESBG is systematically lower and also much denser in higher dimensions. I also have substantive/philosophical reasons for this de-normalization. Briefly:
I think that agents do not know in how many dimensions they live, nor what the maximum possible distance is; they feel polarization with regard to the other group, not with regard to the group and the possible maximum distance. Let's do the following thought experiment:
Our agents live in 1-D; they discuss just one topic, they are divided into two camps of equal size, and these two camps sit at the poles -1 and +1 of their opinion space. The polarization is maximal, ESBG is 1. Then we take this strange world on a string and put it on a table; now they are in a 2-D world, but their distance is the same since they did not change it, so they should still feel polarization of ESBG = 1, since nothing changed. Then we recognize that the table is in a room (3-D), then we recognize time (4-D)... But polarization should still be the same, since these agents do not change their positions.
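The thought experiment can be checked numerically. A minimal base-R sketch; the `2 * sqrt(dims)` normalizer below is my assumption standing in for the actual ESBG formula, used only to illustrate the effect:

```r
## Thought experiment in code: two equal camps at the poles -1 and +1 of a
## 1-D opinion space, then embedded unchanged into 2-D, 3-D, 4-D.
for (dims in 1:4) {
  pa = c(-1, rep(0, dims - 1))            # camp A; extra dimensions at 0
  pb = c(+1, rep(0, dims - 1))            # camp B
  d  = sqrt(sum((pa - pb)^2))             # raw distance: always 2
  normalized   = d / (2 * sqrt(dims))     # shrinks as dimensions grow
  denormalized = normalized * sqrt(dims)  # constant: agents feel the same
  cat(sprintf("%d-D: raw = %.2f, normalized = %.2f, de-normalized = %.2f\n",
              dims, d, normalized, denormalized))
}
```

The raw distance never changes, the normalized value artificially shrinks with each added dimension, and multiplying back by sqrt(dims) restores the constant value, which is exactly what the de-normalization in the `mutate()` above does.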
Now, let’s show our results graphically!
dfi %>%
ggplot() +
aes(x = start) +
geom_histogram(color = "steelblue", fill = "steelblue", alpha = .5) +
labs(title ="Polarization at the start of simulation") +
theme_minimal()
dfi %>%
ggplot() +
aes(x = final) +
geom_histogram(color = "steelblue", fill = "steelblue", alpha = .5) +
labs(title ="Polarization at the end of simulation") +
theme_minimal()
dfi %>%
ggplot() +
aes(x = change, fill = change_cat, col = change_cat ) +
geom_histogram( alpha = .5, breaks = seq(-0.85, 0.95, 0.05 ) ) +
labs(title ="Change of polarization between the start and the end of simulation") +
theme_minimal() +
theme(legend.position = "bottom")
dfi %>% frq(change_cat, show.na = FALSE) %>% kable()
OK, so we see that polarization increases more often than it decreases. A decrease larger than 0.1 (i.e. a change lower than -0.1) happens in 25% of simulations. So it happens, but less frequently than polarization increase (an increase larger than +0.1 happens in 40.3% of simulations). We might conclude that polarization increase is generally more frequent. BUT :-) we also have to determine the conditions of increase and decrease. I hope the color maps help us make an initial exploration:
dfs %>%
ggplot() +
aes(x = acceptability, col = change, fill = change, label = round(100*change, 0),
y = narrowness) +
facet_wrap(vars(pol_dist, pol_ops, opinions), ncol=6) +
geom_point(alpha = 1, size = 12.5, shape = 22) +
geom_text(color = "white", size = 3) +
scale_fill_gradient2(low = "blue", mid = "yellow", high = "black", midpoint = 0) +
scale_color_gradient2(low = "blue", mid = "yellow", high = "black", midpoint = 0) +
scale_y_continuous(breaks = seq(0.05, 0.85, 0.1)) +
scale_x_continuous(breaks = seq(0.15, 0.45, 0.1)) +
labs(title = "Change of polarization in simulations according to: \n'Narrowness of id. grp.' (0.05--0.85) and 'Acceptability of diff. ops' (0.05--0.5) and by:\n'Distance of polarized groups' (0.35, 0.7, 1),\n'Polarized opinions' (1, 2, 3) and\n 'Opinion dimensions' (2, 3, 4)",
x = "Average acceptability of different opinions") +
guides(alpha = "none") +
theme_light() +
theme(legend.position = "top")
The pattern in the facet (Polarized groups distance = 1,
Polarized opinions = 3, Opinions = 4) is
logical: simulations are highly polarized at the start here, so there is
big potential for a big decrease. If Narrowness is low
and Acceptability is high, then some agents create a bridge
between the polarized groups, the groups merge, and the potential is realized; but
with slightly lower Acceptability or slightly higher
Narrowness the groups preserve/defend themselves, no bridge
is built, each group also unifies in the last non-polarized opinion, and
as a result the overall polarization increases; or Acceptability is so low or Narrowness is so
high that no change is possible.
Let's now do the same maps, but for the initial and final state of polarization.
dfs %>%
ggplot() +
aes(x = acceptability, col = start, fill = start, label = round(100*start, 0),
y = narrowness) +
facet_wrap(vars(pol_dist, pol_ops, fct_rev(factor(opinions))), ncol=6) +
geom_point(alpha = 1, size = 12.5, shape = 22) +
geom_text(color = "white", size = 3) +
scale_fill_gradient2(low = "blue", mid = "yellow", high = "black", midpoint = 0) +
scale_color_gradient2(low = "blue", mid = "yellow", high = "black", midpoint = 0) +
scale_y_continuous(breaks = seq(0.05, 0.85, 0.1)) +
scale_x_continuous(breaks = seq(0.15, 0.45, 0.1)) +
labs(title = "Average polarization of initial state of simulations according to: \n'Narrowness of id. grp.' (0.05--0.85) and 'Acceptability of diff. ops' (0.05--0.5) and by:\n'Distance of polarized groups' (0.35, 0.7, 1),\n'Polarized opinions' (1, 2, 3) and\n 'Opinion dimensions' (2, 3, 4)",
x = "Average acceptability of different opinions") +
guides(alpha = "none") +
theme_light() +
theme(legend.position = "top")
dfs %>%
ggplot() +
aes(x = acceptability, col = final, fill = final, label = round(100*final, 0),
y = narrowness) +
facet_wrap(vars(pol_dist, pol_ops, fct_rev(factor(opinions))), ncol=6) +
geom_point(alpha = 1, size = 12.5, shape = 22) +
geom_text(color = "white", size = 3) +
scale_fill_gradient2(low = "blue", mid = "yellow", high = "black", midpoint = 0.4) +
scale_color_gradient2(low = "blue", mid = "yellow", high = "black", midpoint = 0.4) +
scale_y_continuous(breaks = seq(0.05, 0.85, 0.1)) +
scale_x_continuous(breaks = seq(0.15, 0.45, 0.1)) +
labs(title = "Average final polarization according to: \n'Narrowness of id. grp.' (0.05--0.85) and 'Acceptability of diff. ops' (0.05--0.5) and by:\n'Distance of polarized groups' (0.35, 0.7, 1),\n'Polarized opinions' (1, 2, 3) and\n 'Opinion dimensions' (2, 3, 4)",
caption = "Note: We set the mid-point and yellow color to the ESBG polarization value 0.4, which is the overall mean of initial polarization;\nthe black color then shows which simulations ended more polarized than the overall mean of initial polarization.",
x = "Average acceptability of different opinions") +
guides(alpha = "none") +
theme_light() +
theme(legend.position = "top")
## For presenting variability we try now boxplots on individual data (non-aggregated):
dfi %>%
# filter(round(100 * narrowness, 0) %in% seq(5, 85, 10)) %>%
# sample_n(2000) %>%
## Selecting variables:
# select(opinions, boundary, id_threshold, ESBG) %>%
mutate(acceptability = factor(acceptability),
opinions = factor(opinions),
grouper = paste(pol_dist, acceptability, pol_ops, opinions, sep = "; ")) %>%
# ## Renaming variables according 2022-03-18 meeting:
# prejmenuj(1:3, c("Opinion dimensions:", "Acceptability of different opinion:",
# "Narrowness of identity group:")) %>%
## Graph itself:
ggplot() +
aes(fill = acceptability, y = final,
x = narrowness,
group = narrowness,
col = acceptability) +
facet_wrap(vars(grouper), ncol=6) +
geom_boxplot(alpha = 0.2) +
geom_jitter(alpha = 0.2) +
scale_x_continuous(breaks = seq(0.05, 0.850, 0.20)) +
labs(title = "Final polarization in simulations by 'Opinion dimensions' (2, 3, 4),\n'Narrowness of identity group' (0.05--0.85), 'Average acceptability of different opinions' (0.05--0.5),\n'Polarized opinions' (1, 2, 3) and 'Distance of polarized groups' (0.35, 0.7, 1)",
x = "Narrowness of identity group", y = "Polarization") +
guides(fill = "none", color = "none") +
theme_light() +
theme(legend.position = "top")
Same data, slightly different graphics:
## For presenting variability we try now boxplots on individual data (non-aggregated):
dfi %>%
# filter(round(100 * narrowness, 0) %in% seq(5, 85, 10)) %>%
# sample_n(2000) %>%
## Selecting variables:
# select(opinions, boundary, id_threshold, ESBG) %>%
mutate(acceptability = factor(acceptability),
opinions = factor(opinions),
grouper = paste(pol_dist, acceptability, pol_ops, opinions, sep = "; ")) %>%
# ## Renaming variables according 2022-03-18 meeting:
# prejmenuj(1:3, c("Opinion dimensions:", "Acceptability of different opinion:",
# "Narrowness of identity group:")) %>%
## Graph itself:
ggplot() +
aes(fill = acceptability, y = final,
x = narrowness,
group = narrowness,
col = acceptability) +
facet_wrap(vars(grouper), ncol=6) +
geom_point(alpha = 0.05) +
scale_x_continuous(breaks = seq(0.05, 0.850, 0.20)) +
labs(title = "Final polarization in simulations by 'Opinion dimensions' (2, 3, 4),\n'Narrowness of identity group' (0.05--0.85), 'Average acceptability of different opinions' (0.05--0.5),\n'Polarized opinions' (1, 2, 3) and 'Distance of polarized groups' (0.35, 0.7, 1)",
x = "Narrowness of identity group", y = "Polarization") +
guides(fill = "none", color = "none") +
theme_light() +
theme(legend.position = "top")
Now the same, but with the roles of Acceptability and
Narrowness swapped:
## For presenting variability we try now boxplots on individual data (non-aggregated):
dfi %>%
# filter(round(100 * narrowness, 0) %in% seq(5, 85, 10)) %>%
# sample_n(2000) %>%
## Selecting variables:
# select(opinions, boundary, id_threshold, ESBG) %>%
mutate(narrowness = factor(narrowness),
opinions = factor(opinions),
grouper = paste(pol_dist, narrowness, pol_ops, opinions, sep = "; ")) %>%
# ## Renaming variables according 2022-03-18 meeting:
# prejmenuj(1:3, c("Opinion dimensions:", "Acceptability of different opinion:",
# "Narrowness of identity group:")) %>%
## Graph itself:
ggplot() +
aes(x = acceptability, y = final,
fill = narrowness,
col = narrowness,
group = acceptability) +
facet_wrap(vars(grouper), ncol=6) +
geom_boxplot(alpha = 0.2) +
geom_jitter(alpha = 0.2) +
scale_x_continuous(breaks = seq(0.05, 0.850, 0.20)) +
labs(title = "Final polarization in simulations by 'Opinion dimensions' (2, 3, 4),\n'Narrowness of identity group' (0.05--0.85), 'Average acceptability of different opinions' (0.05--0.5),\n'Polarized opinions' (1, 2, 3) and 'Distance of polarized groups' (0.35, 0.7, 1)",
x = "Acceptability of different opinion", y = "Polarization") +
guides(fill = "none", color = "none") +
theme_light() +
theme(legend.position = "top")
Same data, slightly different graphics:
## For presenting variability we try now boxplots on individual data (non-aggregated):
dfi %>%
# filter(round(100 * narrowness, 0) %in% seq(5, 85, 10)) %>%
# sample_n(2000) %>%
## Selecting variables:
# select(opinions, boundary, id_threshold, ESBG) %>%
mutate(narrowness = factor(narrowness),
opinions = factor(opinions),
grouper = paste(pol_dist, narrowness, pol_ops, opinions, sep = "; ")) %>%
# ## Renaming variables according 2022-03-18 meeting:
# prejmenuj(1:3, c("Opinion dimensions:", "Acceptability of different opinion:",
# "Narrowness of identity group:")) %>%
## Graph itself:
ggplot() +
aes(x = acceptability, y = final,
fill = narrowness,
col = narrowness,
group = acceptability) +
facet_wrap(vars(grouper), ncol=6) +
geom_point(alpha = 0.05) +
scale_x_continuous(breaks = seq(0.05, 0.850, 0.20)) +
labs(title = "Final polarization in simulations by 'Opinion dimensions' (2, 3, 4),\n'Narrowness of identity group' (0.05--0.85), 'Average acceptability of different opinions' (0.05--0.5),\n'Polarized opinions' (1, 2, 3) and 'Distance of polarized groups' (0.35, 0.7, 1)",
x = "Acceptability of different opinion", y = "Polarization") +
guides(fill = "none", color = "none") +
theme_light() +
theme(legend.position = "top")
mc = lm(change ~ start+factor(opinions)+factor(pol_ops)+factor(pol_dist)+factor(narrowness)+factor(acceptability), data = dfi)
mcs = summary(mc)
pc = lm(change ~ start+factor(opinions)*factor(pol_ops)*factor(pol_dist)+factor(narrowness)+factor(acceptability), data = dfi)
pcs = summary(pc)
fc = lm(change ~ start+factor(opinions)*factor(pol_ops)*factor(pol_dist)*factor(narrowness)*factor(acceptability), data = dfi)
fcs = summary(fc)
sf = lm(final ~ start, data = dfi)
sfs = summary(sf)
mf = lm(final ~ start+factor(opinions)+factor(pol_ops)+factor(pol_dist)+factor(narrowness)+factor(acceptability), data = dfi)
mfs = summary(mf)
pf = lm(final ~ start+factor(opinions)*factor(pol_ops)*factor(pol_dist)+factor(narrowness)+factor(acceptability), data = dfi)
pfs = summary(pf)
ff = lm(final ~ start+factor(opinions)*factor(pol_ops)*factor(pol_dist)*factor(narrowness)*factor(acceptability), data = dfi)
ffs = summary(ff)
Fun/interesting fact first: polarization of the initial state by itself explains 63.3% of the variability of final-state polarization. The basic fully factorized model (without interactions) explains 72.4% of the variability, and despite consuming a lot more degrees of freedom (DF), BIC signals that this more complicated model is really better and worth the consumed DF (BIC(more complicated) - BIC(super simple) = -16k).
I interpret it in the following way: while the initial polarization determines the final polarization to a huge degree, the context (number of opinion dimensions, number of polarized opinions, distance of polarized groups, narrowness of identity group) and individual traits (acceptability of different opinions) are still able to predict deviations from the main trend set by the initial polarization.
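The R² and BIC comparison behind these numbers can be pulled directly from the fitted objects; a minimal sketch, assuming `sf`, `mf`, `sfs`, and `mfs` from the chunk above:

```r
## R^2: how much variability of final polarization each model explains
c(R2_start_only = sfs$r.squared, R2_basic = mfs$r.squared)
## BIC difference: a negative value says the basic model is worth its extra DF
BIC(mf) - BIC(sf)
```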
I just wanna know how much variability we can explain by the full model. And I am absolutely surprised and excited! \(R^2\)s are extremely high (in the case of the full model). The variable describing initial polarization doesn't improve the model predicting change that much, but it really improves the model predicting final polarization. How is it possible that the initial polarization doesn't help with predicting change, but helps to predict the final state? I have only one explanation: change is computed as the difference between final and initial polarization, so the change already contains some information on the initial state... wait a minute! If the parameters predict initial polarization with high precision, then adding the initial polarization itself is redundant... Hmm... I'll try something...
ms = lm(start ~ factor(opinions)+factor(pol_ops)+factor(pol_dist)+factor(narrowness)+factor(acceptability), data = dfi)
mss = summary(ms)
ps = lm(start ~ factor(opinions)*factor(pol_ops)*factor(pol_dist)+factor(narrowness)+factor(acceptability), data = dfi)
pss = summary(ps)
fs = lm(start ~ factor(opinions)*factor(pol_ops)*factor(pol_dist)*factor(narrowness)*factor(acceptability), data = dfi)
fss = summary(fs)
BINGO! Even the basic model with
fully factorized main effects only predicts the initial
polarization very well (81.2% of variability)!!! The best model here is
the fully factorized one with selected interactions (93.9% of
variability)!!! More precisely, the full model is better at explaining
variability (93.9% of variability!!!), but BIC tells us the improvement
isn't worth the degrees of freedom spent on such precise
prediction (BIC(full) - BIC(selected interactions) = 8548.5768173). So
much for the prediction of initial-state polarization. Let's get back
to change and final state.
In both cases the full model is the best; the increase in
\(R^2\) is so big that it makes no sense
to also check BIC. With the full model we can explain 85.8 %
of the variability of polarization change, instead of the 41.2 %
explained by the fully factorized model with selected interactions.
Regarding the polarization of the final state, the performance of
the full model is even better: we can explain 93.5 % of the
variability, instead of the 72.9 % explained by the
fully factorized model with selected interactions.
Why is the prediction of polarization like that? We can easily predict initial polarization in polarized scenarios, we can predict the final polarization, and we can predict the change quite well in polarized scenarios, but it is so hard to predict the final polarization of simulations starting from random initial conditions.
Now I see only one explanation: the answer is the randomness of the initial conditions! I have to test it properly later, but for now it seems that from random initial conditions the results can be very diverse, while (high) initial polarization controls the course of the simulation, so the final polarization and the change are much easier to predict.
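That hunch could later be probed with a quick sketch like this (a hypothetical check, assuming `dfi` as built above; the median split standing in for "random vs. polarized start" is my assumption):

```r
## R^2 of final ~ start, separately for runs starting below vs. above the
## median initial polarization; a much lower R^2 in the low-start group
## would support the randomness explanation.
sapply(split(dfi, dfi$start > median(dfi$start)),
       function(d) summary(lm(final ~ start, data = d))$r.squared)
```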